The FEMTI guidelines for contextual MT evaluation: principles and resources
Authors
Abstract
A large number of evaluation metrics exist for machine translation (MT) systems, but depending on the intended context of use of such a system, not all metrics are equally relevant. Based on the ISO/IEC 9126 and 14598 standards for software evaluation, the Framework for the Evaluation of Machine Translation in ISLE (FEMTI) provides guidelines for the selection of quality characteristics to be evaluated depending on the expected task, users, and input characteristics of an MT system. This approach to contextual evaluation was implemented as a web-based application that helps its users design evaluation plans. In addition, FEMTI offers experts in evaluation the possibility to enter and share their knowledge using a dedicated web-based tool, which has been tested in several evaluation exercises.
Related resources
Improving quality models for MT evaluation based on evaluators’ feedback
The Framework for the Evaluation of Machine Translation (FEMTI) contains guidelines for building a quality model that is used to evaluate MT systems in relation to the purpose and intended context of use of the systems. Contextual quality models can thus be constructed, but entering into FEMTI the knowledge required for this operation is a complex task. An experiment has been set up in order t...
Towards Automatic Generation of Evaluation Plans for Context-based MT Evaluation
This report extends the FEMTI guidelines for context-based MT evaluation with new functionalities aimed at evaluators and experts. The proposed interface to FEMTI generates an outline evaluation plan depending on the characteristics of the context in which an MT system will be used, entered by the evaluators. We first summarize the principle of context-based MT evaluation and the initial FEMTI ...
Finding the System that Suits you Best: Towards the Normalization of MT Evaluation
The Framework for the Evaluation of Machine Translation, FEMTI, brings together the many disparate metrics and methods which have been devised for MT and helps evaluators to design an evaluation plan based on the context of use intended for the system. FEMTI therefore allows the generation of more standardized and reusable evaluation plans. By evaluators we mean not only developers and programm...
Reference-based vs. task-based evaluation of human language technology
This paper starts from the ISO distinction of three types of evaluation procedures – internal, external and in use – and proposes to match these types to the three types of human language technology (HLT) systems: analysis, generation, and interactive. The paper explains why internal evaluation is not suitable to measure the qualities of HLT systems, and shows that reference-based external eval...
FEMTI: creating and using a framework for MT evaluation
This paper presents FEMTI, a web-based Framework for the Evaluation of Machine Translation in ISLE. FEMTI offers structured descriptions of potential user needs, linked to an overview of technical characteristics of MT systems. The description of possible systems is mainly articulated around the quality characteristics for software products set out in ISO/IEC standard 9126. Following the philoso...